On Translation Invariant Kernels and Screw Functions 



Purushottam Kar and Harish Karnick 
Department of Computer Science and Engineering 
Indian Institute of Technology Kanpur 
{purushot,hk}@cse.iitk.ac.in

February 19, 2013 



Abstract 

We explore the connection between Hilbertian metrics and positive definite kernels on the real
line. In particular, we look at a well-known characterization of translation invariant Hilbertian
metrics on the real line by von Neumann and Schoenberg (1941). Using this result we are able
to give an alternate proof of Bochner's theorem for translation invariant positive definite kernels
on the real line (Rudin, 1962).



1 Introduction 

We start off with a few definitions and set the notation. In the following discussion, $\mathfrak{H}$ shall
denote the real Hilbert space, small case letters $x, y, \ldots$ shall denote real numbers, and boldface small
letters $\mathbf{x}, \mathbf{y}, \ldots$ shall denote objects in arbitrary domains. A metric $D$ defined on some domain
$\Omega$ will be said to be Hilbertian if there exists a map $\xi : \Omega \to \mathfrak{H}$ such that for all
$\mathbf{x}, \mathbf{y} \in \Omega$, $D(\mathbf{x}, \mathbf{y}) = \|\xi(\mathbf{x}) - \xi(\mathbf{y})\|_{\mathfrak{H}}$. A symmetric real valued kernel
$K : \Omega \times \Omega \to \mathbb{R}$ will be said to be positive definite if for any $n \in \mathbb{N}$ and any
$\mathbf{x}_1, \ldots, \mathbf{x}_n \in \Omega$, the matrix $G_K = [K(\mathbf{x}_i, \mathbf{x}_j)]$ satisfies
$c^\top G_K c \geq 0$ for every $c \in \mathbb{R}^n$. A symmetric real valued kernel $N : \Omega \times \Omega \to \mathbb{R}$ will be
said to be negative definite if whenever $c^\top \mathbf{1} = 0$, we have $c^\top G_N c \leq 0$ (here $\mathbf{1} \in \mathbb{R}^n$ is the vector of
all ones). It is easy to verify that all squared Hilbertian metrics are negative definite.
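The closing claim can be checked numerically. The following is a minimal sketch, assuming the squared Euclidean distance on $\mathbb{R}^3$ as the squared Hilbertian metric; the helper names `sq_dist`, `quadratic_form` and `centered_coefficients` are illustrative and do not come from the text. It evaluates $c^\top G_N c$ for random coefficient vectors with $c^\top \mathbf{1} = 0$.

```python
import random

def sq_dist(x, y):
    """Squared Euclidean distance, used here as a squared Hilbertian metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def quadratic_form(points, c, kernel):
    """Return c^T G c where G[i][j] = kernel(points[i], points[j])."""
    return sum(ci * cj * kernel(xi, xj)
               for ci, xi in zip(c, points)
               for cj, xj in zip(c, points))

def centered_coefficients(n, rng):
    """Random coefficient vector c whose entries sum to zero (c^T 1 = 0)."""
    c = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(c) / n
    return [ci - mean for ci in c]

rng = random.Random(0)  # fixed seed so the run is reproducible
points = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(6)]

# Largest value of c^T G_N c over many zero-sum coefficient vectors.
worst = max(quadratic_form(points, centered_coefficients(6, rng), sq_dist)
            for _ in range(200))
print("largest quadratic form:", worst)
```

Up to floating point error every sampled quadratic form is non-positive, in line with the identity $c^\top G_N c = -2\,\|\sum_i c_i \mathbf{x}_i\|^2$, which holds for the squared Euclidean distance whenever $c^\top \mathbf{1} = 0$.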

There exists a close connection between negative definite kernels and positive definite kernels,
as given below:



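Theorem 1 (Berg et al., 1984, Chapter 3, Lemma 2.1). For any given kernel $N$ over $\Omega$, and
some fixed $\mathbf{0} \in \Omega$, define a new kernel $K$ as
$K(\mathbf{x}, \mathbf{y}) = \frac{1}{2}\left(N(\mathbf{x}, \mathbf{0}) + N(\mathbf{y}, \mathbf{0}) - N(\mathbf{x}, \mathbf{y}) - N(\mathbf{0}, \mathbf{0})\right)$.
Alternatively, if $N(\mathbf{0}, \mathbf{0}) \geq 0$, define
$K(\mathbf{x}, \mathbf{y}) = \frac{1}{2}\left(N(\mathbf{x}, \mathbf{0}) + N(\mathbf{y}, \mathbf{0}) - N(\mathbf{x}, \mathbf{y})\right)$.
In either case, $N$ is negative definite iff $K$ is positive definite.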

Using the above expression for $K(\mathbf{x}, \mathbf{y})$, it is simple to arrive at the following relation

\[ -\frac{1}{2}\left(N(\mathbf{x}, \mathbf{x}) + N(\mathbf{y}, \mathbf{y}) - 2N(\mathbf{x}, \mathbf{y})\right) = K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y}). \]

In case $N$ is a squared Hilbertian metric satisfying identity of the indiscernibles (i.e. $N(\mathbf{x}, \mathbf{x}) = 0$
for all $\mathbf{x} \in \Omega$), we can obtain the following relation:
$N(\mathbf{x}, \mathbf{y}) = K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y})$. This
allows us to arrive at the following simple but useful corollary:

Corollary 2. Given a distance function $D$ over $\Omega$ satisfying identity of the indiscernibles and
some fixed $\mathbf{0} \in \Omega$, define a kernel $K$ as
$K(\mathbf{x}, \mathbf{y}) = \frac{1}{2}\left(D^2(\mathbf{x}, \mathbf{0}) + D^2(\mathbf{y}, \mathbf{0}) - D^2(\mathbf{x}, \mathbf{y})\right)$. Then $D$ is
Hilbertian iff $K$ is positive definite.

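Proof. To see that positive definiteness of $K$ is necessary, simply invoke Theorem 1 along with
the fact that since $D$ is a metric, $D^2(\mathbf{x}, \mathbf{x}) \geq 0$ for all $\mathbf{x} \in \Omega$ and that all squared
Hilbertian metrics are negative definite. To see that it is sufficient, simply invoke Mercer's theorem
(Mercer, 1909) (or equivalently the discussion in Berg et al., 1984, Chapter 3, Section 3) to
confirm the existence of a mapping $\varphi : \Omega \to \mathfrak{H}$ such that
$K(\mathbf{x}, \mathbf{y}) = \langle \varphi(\mathbf{x}), \varphi(\mathbf{y}) \rangle_{\mathfrak{H}}$. Thus we have
$\|\varphi(\mathbf{x}) - \varphi(\mathbf{y})\|_{\mathfrak{H}}^2 = K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y})$.
Next, using the definition of $K$, we conclude that
$K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y}) = D^2(\mathbf{x}, \mathbf{y}) - \frac{1}{2}\left(D^2(\mathbf{x}, \mathbf{x}) + D^2(\mathbf{y}, \mathbf{y})\right) = D^2(\mathbf{x}, \mathbf{y})$
since $D$ satisfies identity of the indiscernibles. This gives us
$\|\varphi(\mathbf{x}) - \varphi(\mathbf{y})\|_{\mathfrak{H}}^2 = D^2(\mathbf{x}, \mathbf{y})$, thus proving that $D$ is in
fact a Hilbertian metric. □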
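The two directions of this correspondence can be sketched in code. The example below is a hedged illustration only: it assumes the Euclidean distance on $\mathbb{R}^4$ as the Hilbertian metric $D$ and an arbitrary base point playing the role of $\mathbf{0}$; the function names `kernel_from_metric` and `metric_from_kernel` are hypothetical. In this special case the kernel of Corollary 2 reduces to the inner product of the points shifted by the base point.

```python
import random

def dist(x, y):
    """Euclidean distance, assumed here as the Hilbertian metric D."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def kernel_from_metric(D, base):
    """Corollary 2: K(x, y) = (D^2(x, base) + D^2(y, base) - D^2(x, y)) / 2."""
    def K(x, y):
        return 0.5 * (D(x, base) ** 2 + D(y, base) ** 2 - D(x, y) ** 2)
    return K

def metric_from_kernel(K):
    """Recover D from K via D^2(x, y) = K(x, x) + K(y, y) - 2 K(x, y)."""
    def D(x, y):
        # Clamp at zero to guard against tiny negative rounding errors.
        return max(K(x, x) + K(y, y) - 2.0 * K(x, y), 0.0) ** 0.5
    return D

rng = random.Random(0)
base = [rng.uniform(-3.0, 3.0) for _ in range(4)]
x = [rng.uniform(-3.0, 3.0) for _ in range(4)]
y = [rng.uniform(-3.0, 3.0) for _ in range(4)]

K = kernel_from_metric(dist, base)
D = metric_from_kernel(K)

shifted_inner = sum((a - o) * (b - o) for a, b, o in zip(x, y, base))
print("K(x, y) vs <x - base, y - base>:", K(x, y), shifted_inner)
print("recovered D vs original D:", D(x, y), dist(x, y))
```

The round trip metric, kernel, metric returns the original distance, which mirrors the final step of the proof.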

The preceding corollary and its proof give us the well known correspondence between the set
of positive definite kernels and Hilbertian metrics (Schölkopf, 2000). Given any positive definite
kernel $K$ it is possible to construct a Hilbertian metric $D_K$ and vice versa (using the polarization
identity). Moreover, the preceding result holds for any arbitrary domain $\Omega$. However, if translation
invariance is considered on some locally compact Abelian (LCA) group, this symmetry breaks, as
is demonstrated by the following discussion.

Lemma 3. Consider a domain $\Omega \subset G$ where $G$ is some LCA group. Then every translation
invariant positive definite kernel $K$ on $G$ yields a translation invariant Hilbertian metric $D$.

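Proof. By translation invariance of $K$ we have, for all $\mathbf{x}, \mathbf{y}, \mathbf{t} \in G$,
$K(\mathbf{x} + \mathbf{t}, \mathbf{y} + \mathbf{t}) = K(\mathbf{x}, \mathbf{y})$; in particular
$K(\mathbf{x}, \mathbf{x}) = K(\mathbf{0}, \mathbf{0})$ for every $\mathbf{x}$.
Thus, for the corresponding Hilbertian metric
$D^2(\mathbf{x}, \mathbf{y}) = K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y})$, we have
$D^2(\mathbf{x} + \mathbf{t}, \mathbf{y} + \mathbf{t}) = 2K(\mathbf{0}, \mathbf{0}) - 2K(\mathbf{x} + \mathbf{t}, \mathbf{y} + \mathbf{t}) = 2K(\mathbf{0}, \mathbf{0}) - 2K(\mathbf{x}, \mathbf{y}) = D^2(\mathbf{x}, \mathbf{y})$. □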

However, the converse does not hold true: take for example the familiar inner product on some
finite dimensional Euclidean space $\mathbb{R}^d$ as the positive definite kernel. Its corresponding Hilbertian
metric is the usual Euclidean metric $\|\cdot\|_2$. However, whereas the metric (i.e. $\|\cdot\|_2$) is translation
invariant, the kernel (i.e. the inner product) is clearly not. In the sequel we shall explore this
asymmetry by taking the example of the LCA group $\mathbb{R}$.
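The counterexample is easy to reproduce. The sketch below assumes $\mathbb{R}^2$ and hand-picked points; the names `inner`, `euclid` and `translate` are illustrative. The metric is computed from the kernel through $D^2(\mathbf{x}, \mathbf{y}) = K(\mathbf{x}, \mathbf{x}) + K(\mathbf{y}, \mathbf{y}) - 2K(\mathbf{x}, \mathbf{y})$, so both quantities come from the same inner product.

```python
import math

def inner(x, y):
    """Standard inner product on R^d, a positive definite kernel."""
    return sum(a * b for a, b in zip(x, y))

def euclid(x, y):
    """Hilbertian metric induced by `inner`: D^2 = K(x,x) + K(y,y) - 2 K(x,y)."""
    return math.sqrt(inner(x, x) + inner(y, y) - 2.0 * inner(x, y))

def translate(x, t):
    """Shift the point x by the vector t."""
    return [a + b for a, b in zip(x, t)]

x, y, t = [1.0, 2.0], [3.0, -1.0], [0.5, 4.0]
xt, yt = translate(x, t), translate(y, t)

print("metric before and after the shift:", euclid(x, y), euclid(xt, yt))
print("kernel before and after the shift:", inner(x, y), inner(xt, yt))
```

The distance is unchanged by the shift while the inner product is not, which is exactly the asymmetry described above.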



2 Screw Functions and Positive Definite Kernels 



Von Neumann and Schoenberg (1941) initiated an investigation that resulted in a complete
characterization of translation invariant Hilbertian metrics on the real line. Recall that a metric $D$
defined on some domain $\Omega$ is said to be Hilbertian if there exists a map $\xi : \Omega \to \mathfrak{H}$ such that for all
$\mathbf{x}, \mathbf{y} \in \Omega$, $D(\mathbf{x}, \mathbf{y}) = \|\xi(\mathbf{x}) - \xi(\mathbf{y})\|_{\mathfrak{H}}$. Von Neumann and Schoenberg referred to the maps $\xi$
that generated these metrics as screw functions.¹ Subsequently there were generalizations of this
work to the more general class of homogeneous Hilbertian metrics (Fug...). However, we
shall refrain from expanding our scope beyond translation invariant Hilbertian metrics on the real
line.

¹ Von Neumann and Schoenberg (1941) also completely characterized transforms of Euclidean metrics on finite
dimensional spaces that are embeddable in $\mathfrak{H}$, i.e. functions of the kind $F : \mathbb{R} \to \mathbb{R}$ such that
$D(\mathbf{x}, \mathbf{y}) = F(\|\mathbf{x} - \mathbf{y}\|)$ is a Hilbertian metric, thus completing a sequence of results that had
previously characterized Hilbertian metric transforms for $\mathfrak{H}$. It is important to note that only for $\mathbb{R}$ do
these provide a complete characterization of all translation invariant Hilbertian metrics.

Below we restate the first main theorem of von Neumann and Schoenberg (1941):



Theorem 4 (von Neumann and Schoenberg, 1941, Theorem 1). All and only those translation
invariant metrics on $\mathbb{R}$ that can be expressed as the following Stieltjes integral are Hilbertian:

\[ D^2(x, y) = D^2(x - y) = \int_0^\infty \frac{\sin^2 t(x - y)}{t^2}\, d\gamma(t), \]

where $\gamma(t)$ is a non-decreasing function on $[0, \infty)$ such that $\int_1^\infty t^{-2}\, d\gamma(t)$ exists.
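A single-frequency special case makes the name "screw function" concrete. The sketch below is an illustration under a stated assumption, not a construction from the text: it takes $\gamma$ to be a step function with one unit jump at a hypothetical frequency $t_0 > 0$, so the Stieltjes integral collapses to $D^2(x - y) = \sin^2(t_0 (x - y)) / t_0^2$. Strictly speaking this $D$ is only a pseudometric, since it vanishes whenever $x - y$ is a multiple of $\pi / t_0$; the point is just to exhibit the embedding. The map `screw_map` (a hypothetical name) winds the real line around a circle of radius $1/(2 t_0)$.

```python
import math

def screw_map(x, t0):
    """Wind the real line around a circle of radius 1 / (2 * t0) in the plane."""
    r = 1.0 / (2.0 * t0)
    return (r * math.cos(2.0 * t0 * x), r * math.sin(2.0 * t0 * x))

def sq_metric(x, y, t0):
    """D^2(x - y) from Theorem 4 when gamma has a single unit jump at t0."""
    return math.sin(t0 * (x - y)) ** 2 / t0 ** 2

def sq_embedded_distance(x, y, t0):
    """Squared Euclidean distance between the images of x and y."""
    (a, b), (c, d) = screw_map(x, t0), screw_map(y, t0)
    return (a - c) ** 2 + (b - d) ** 2

for x, y, t0 in [(0.0, 1.0, 0.7), (-2.5, 3.1, 1.3), (4.0, 4.0, 2.0)]:
    print(sq_metric(x, y, t0), sq_embedded_distance(x, y, t0))
```

The two columns agree because $\|\xi(x) - \xi(y)\|^2 = \frac{1}{4 t_0^2}\left(2 - 2\cos 2t_0(x - y)\right) = \frac{\sin^2 t_0 (x - y)}{t_0^2}$. A general $\gamma$ superimposes such circles over all frequencies.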



Our main observation is that since translation invariant kernels all correspond to translation
invariant Hilbertian metrics, the above result can be used to arrive at a characterization of
translation invariant kernels on the real line as well. This result is famously known as Bochner's theorem
(Rudin, 1962). As a historical note we recall that although today Bochner's theorem is taken to
be the result for general locally compact Abelian groups, it was originally proved for the group of
integers $\mathbb{Z}$ by Herglotz (1911) and for the group of reals $\mathbb{R}$ by Bochner (1933). The extension to
locally compact Abelian groups is actually due to Weil (1938). Bochner was the first to recognize
the key role this result plays in harmonic analysis.

However, note that it is not possible to work things the other way round, i.e. use Bochner's
theorem as a starting point to arrive at the result of von Neumann and Schoenberg (1941), since
not all translation invariant metrics on the real line correspond to translation invariant kernels, as
discussed above. This in some sense shows that, at least for the case of the real line, the result of
von Neumann and Schoenberg (1941) is a more general one than Bochner's theorem and hence the
two are not equivalent.

However, Bochner's theorem does not follow trivially from the above discussion and the proof
progression requires overcoming a few technical hurdles. Most importantly, one is required to qualify
Theorem 4 with other preconditions since the relation between translation invariant kernels and
Hilbertian metrics is not that of an exact bijection.

In the subsequent discussion we will, for the sake of convenience, assume that the kernel $K$ is
positive and normalized, i.e. $K(x, y) \in [0, 1]$ for all $x, y$. This restriction will allow us to present
the main ingredients of the proof without being bogged down by stray factors. We shall later relax
this condition to let $K$ assume arbitrary real values (positive as well as negative) in Appendix A.
Our aim shall be to establish Bochner's theorem in the following form:



Theorem 5. Every real valued translation invariant positive definite kernel on $\mathbb{R}$ is the Fourier
Stieltjes integral of some bounded positive Borel measure on $\mathbb{R}$.



3 Proof of Theorem 5

Suppose we are given a translation invariant kernel $K$ on the real line. We begin by constructing a
distance measure $D$ on the real line as follows (recall the relation used in the proof of Corollary 2):

\[ D^2(x - y) = K(x, x) + K(y, y) - 2K(x, y), \]

where $x, y \in \mathbb{R}$. Since $K$ is assumed to be translation invariant, $D$ is translation invariant as well.
Applying Theorem 4 on $D$, we arrive at the following equalities



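\begin{align*}
2K(0) - 2K(x, y) &= \int_0^\infty \frac{\sin^2 t(x - y)}{t^2}\, d\gamma(t) \\
&= \int_0^\infty \frac{1 - \cos 2t(x - y)}{2t^2}\, d\gamma(t) \\
&= \int_0^\infty \frac{d\gamma(t)}{2t^2} - \int_0^\infty \frac{\cos 2t(x - y)}{2t^2}\, d\gamma(t).
\end{align*}
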
We realize that we might have committed blasphemy by this segregation of the two parts of the
integral since Theorem 4 does not assure us that $\int_0^\infty t^{-2}\, d\gamma(t)$ exists. It only assures us that
$\int_1^\infty t^{-2}\, d\gamma(t)$ exists. To remedy this, we use the following simple lemma:

Lemma 6. $\int_0^\infty \frac{d\gamma(t)}{t^2} \leq 4K(0)$.

Proof. Von Neumann and Schoenberg (1941) characterize bounded Hilbertian metrics by proving
that for any such metric $D$, $\lim_{t \to 0^+} \gamma(t) = \gamma(0)$, i.e. $\gamma(t)$ does not have a discrete component at $0$, and
that $\limsup_{t \to \infty} D^2(t) = \Theta\left(\int_0^\infty t^{-2}\, d\gamma(t)\right)$. More specifically, they prove

\[ \frac{1}{2} \int_0^\infty t^{-2}\, d\gamma(t) \leq \limsup_{t \to \infty} D^2(t) \leq \int_0^\infty t^{-2}\, d\gamma(t). \]

In our case, we have $K(t) \in [0, 1]$ which gives us

\[ D^2(t) = 2K(0) - 2K(t) \leq 2K(0). \]

This proves that our Hilbertian metric is bounded which, upon applying the result of von Neumann and
Schoenberg, proves the claim. □

This justifies the segregation of the integral into two parts. Rearranging terms we get

\[ K(x, y) = C + \int_0^\infty \frac{\cos 2t(x - y)}{4t^2}\, d\gamma(t), \]

where $C = K(0) - \int_0^\infty \frac{d\gamma(t)}{4t^2}$. Note that using Lemma 6 we have $C \geq 0$. Using a change of
variables and a rescaling of the function $\gamma(\cdot)$ (which leaves all properties of $\gamma(\cdot)$ from Theorem 4
unharmed) gives us the following expression:

\[ K(x, y) = C + \int_0^\infty \frac{\cos t(x - y)}{t^2}\, d\gamma(t) \]

for some constant $C \geq 0$.






Now we transform this Stieltjes integral into a Fourier Stieltjes integral to arrive at the canonical
form of Bochner's theorem (for example as given in Rudin, 1962). For any $0 \leq x < \infty$, define
$\alpha(x) := \int_0^x t^{-2}\, d\gamma(t)$, which gives us

\[ K(x, y) = C + \int_0^\infty \cos t(x - y)\, d\alpha(t). \]







Note that $\alpha(\cdot)$ is a non-decreasing right continuous function. Now define, for any interval
$S = (a, b] \subset \mathbb{R}$, the measure $\mu$ as follows

a(6)-(l-2-l,<o)«(a) , C 
■= 2 ^ 2" ' 

An application of Carathéodory's extension theorem allows us to extend $\mu$ to a Borel measure over
$\mathbb{R}$. Since $\alpha(\cdot)$ is a non-decreasing function, $\int_0^\infty t^{-2}\, d\gamma(t) < \infty$ and $C \geq 0$, $\mu$ is a bounded positive
measure and satisfies

\[ K(x, y) = \int_{-\infty}^{\infty} e^{it(x - y)}\, d\mu(t). \]

This establishes Bochner's theorem on the real line as claimed.
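For a concrete kernel the representation can be verified numerically. The following is a minimal sketch, assuming the one-dimensional Gaussian kernel $K(x, y) = e^{-(x - y)^2 / 2}$ and taking $\mu$ to be the standard normal law, which is its classical Fourier pair; neither the quadrature rule nor the function names come from the text. Because the density is even, the imaginary part of the integral vanishes and only the cosine term is integrated.

```python
import math

def gaussian_kernel(x, y):
    """One-dimensional Gaussian kernel exp(-(x - y)^2 / 2)."""
    return math.exp(-0.5 * (x - y) ** 2)

def gaussian_density(t):
    """Standard normal density, the assumed spectral measure of the kernel."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def fourier_stieltjes(x, y, density, limit=12.0, steps=20000):
    """Trapezoidal approximation of the integral of cos(t (x - y)) density(t)
    over [-limit, limit], i.e. the real part of the Fourier Stieltjes integral."""
    h = 2.0 * limit / steps
    total = 0.0
    for k in range(steps + 1):
        t = -limit + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoint weights
        total += weight * math.cos(t * (x - y)) * density(t)
    return total * h

for x, y in [(0.0, 0.0), (1.0, 0.0), (0.3, 2.5)]:
    print(gaussian_kernel(x, y), fourier_stieltjes(x, y, gaussian_density))
```

The total mass of the measure is recovered at $x = y$, where the integral equals $K(0) = 1$, illustrating that the spectral measure is bounded.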

4 Extensions 

We note that the above characterization can also be extended to separable kernels on finite
dimensional Euclidean spaces, i.e. kernels which can be written as
$K(\mathbf{x}, \mathbf{y}) = \prod_{i=1}^{d} K_i(x_i, y_i)$ where
$K_i(x_i, y_i)$ is a translation invariant kernel on the $i$-th dimension. The resulting characterization is
given as the following theorem:

Theorem 7. Every separable real valued translation invariant positive definite kernel on $\mathbb{R}^d$ is the
Fourier Stieltjes integral of some bounded positive Borel measure on $\mathbb{R}^d$.

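Proof. We have $K(\mathbf{x}, \mathbf{y}) = \prod_{i=1}^{d} K_i(x_i, y_i)$ where every $K_i$ is a translation invariant kernel. Using
Theorem 5 there exist some positive bounded Borel measures $\mu_1, \ldots, \mu_d$ on $\mathbb{R}$ such that for all
$i = 1, \ldots, d$, we have

\[ K_i(x_i, y_i) = \int_{-\infty}^{\infty} e^{\mathrm{i} t_i (x_i - y_i)}\, d\mu_i(t_i), \]

where $\mathrm{i}$ denotes the imaginary unit. Thus we have

\begin{align*}
K(\mathbf{x}, \mathbf{y}) &= \prod_{i=1}^{d} K_i(x_i, y_i) = \prod_{i=1}^{d} \int_{-\infty}^{\infty} e^{\mathrm{i} t_i (x_i - y_i)}\, d\mu_i(t_i) \\
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left( \prod_{i=1}^{d} e^{\mathrm{i} t_i (x_i - y_i)} \right) d\mu_1(t_1) \cdots d\mu_d(t_d)
= \int_{\mathbb{R}^d} e^{\mathrm{i} \langle \mathbf{t}, \mathbf{x} - \mathbf{y} \rangle}\, d\mu(\mathbf{t}),
\end{align*}

where $\mu = \mu_1 \times \ldots \times \mu_d$ is a bounded positive product measure and we have used Fubini's theorem
in the fourth step. □

This characterization applies to some of the most widely used translation invariant kernels
(Rahimi and Recht, 2007), given below:

1. Gaussian kernel $K(\mathbf{x}, \mathbf{y}) = e^{-\frac{\|\mathbf{x} - \mathbf{y}\|_2^2}{2}} = \prod_{i=1}^{d} e^{-\frac{(x_i - y_i)^2}{2}}$

2. Laplacian kernel $K(\mathbf{x}, \mathbf{y}) = e^{-\|\mathbf{x} - \mathbf{y}\|_1} = \prod_{i=1}^{d} e^{-|x_i - y_i|}$

3. Cauchy kernel $K(\mathbf{x}, \mathbf{y}) = \prod_{i=1}^{d} \frac{2}{1 + (x_i - y_i)^2}$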
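The separability of the first two kernels can be confirmed directly. The sketch below is illustrative only; the function names are hypothetical and the test points are arbitrary. It compares each kernel, computed from the norm of $\mathbf{x} - \mathbf{y}$, with the product of its one-dimensional factors.

```python
import math

def gaussian(x, y):
    """Gaussian kernel exp(-||x - y||_2^2 / 2) computed from the squared 2-norm."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))

def laplacian(x, y):
    """Laplacian kernel exp(-||x - y||_1) computed from the 1-norm."""
    return math.exp(-sum(abs(a - b) for a, b in zip(x, y)))

def product_of_factors(factor, x, y):
    """Product over coordinates of a one-dimensional kernel `factor`."""
    return math.prod(factor(a, b) for a, b in zip(x, y))

def gaussian_1d(a, b):
    return math.exp(-0.5 * (a - b) ** 2)

def laplacian_1d(a, b):
    return math.exp(-abs(a - b))

x = [0.5, -1.0, 2.0]
y = [1.5, 0.25, -0.5]
print(gaussian(x, y), product_of_factors(gaussian_1d, x, y))
print(laplacian(x, y), product_of_factors(laplacian_1d, x, y))
```

The Cauchy kernel is defined as a product of one-dimensional factors to begin with, so no separate check is needed for it.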

A Generalizing the proof for arbitrary kernels 

We now assume no normalizations on $K$, i.e. we allow, for all $x, y$, $K(x, y) \in \mathbb{R}$. What this lack
of normalization (more specifically, the ability of the kernel to take negative values) hinders is the
ability to use $K(x, y) \geq 0$ to show that $C = K(0) - \int_0^\infty \frac{d\gamma(t)}{4t^2} \geq 0$ in the proof. Thus we are left,
at the end of the day, with the following characterization:

\[ K(x, y) = \int_{-\infty}^{\infty} e^{it(x - y)}\, d\mu(t) + C, \]



where $\mu$ is a bounded positive measure and $C$ is some (possibly negative) real constant. We first
proceed by absorbing this constant into the measure $\mu$ by defining a new measure $\tilde{\mu}$. For any set
$S = (a, b] \subset \mathbb{R}$, define

\[ \tilde{\mu}(S) := \mu(S) + C \cdot \mathbb{1}_{0 \in (a, b]}, \]

which gives us

\[ K(x, y) = \int_{-\infty}^{\infty} e^{it(x - y)}\, d\tilde{\mu}(t). \]



Notice that this might have caused $\tilde{\mu}$ to acquire a discrete component at $0$ (notice that due to the
property of $\gamma(\cdot)$ that $\lim_{t \to 0^+} \gamma(t) = \gamma(0)$, $\mu$ does not have a discrete component at $0$ to begin with).
What we will show is that unless this discrete component is non-negative, $K$ cannot be a positive
definite kernel, which will establish Bochner's theorem.

This we shall show by proving that in case $C \neq 0$ (if $C = 0$ then we are done), $\tilde{\mu}(0)$ is
actually an eigenvalue of $K$. This will imply that for $K$ to be positive semi-definite, $\tilde{\mu}(0) \geq 0$, which
shall complete our proof.

Lemma 8. In case the modified distribution $\tilde{\mu}$ has a discrete component $\tilde{\mu}(0)$ at $0$, then $\tilde{\mu}(0)$ is an
eigenvalue of $K$.

Proof. For simplicity (and also without loss of generality), assume that the domain of $K$ is the
unit interval $[-1, 1]$ so that the kernel is a function $K : [-1, 1] \times [-1, 1] \to \mathbb{R}$ (note that the kernel
is still being allowed to take arbitrary real values). The purpose of proposing a compact domain is
to allow us to work with constant functions $f(x) = c$ and still have $f(x) \in L^2$ with respect to the
Lebesgue measure. Consider the integral operator $T_K$ on $L^2$ corresponding to the kernel $K$ defined
as follows:

\[ T_K(f)(x) = \int_{-\infty}^{\infty} f(y) K(x, y)\, dy. \]



We will show below that the constant function is an eigenfunction of this integral operator. Taking
$f(x) = c$ we get

\[ T_K(f)(x) = \int_{-\infty}^{\infty} f(y) K(x, y)\, dy = c \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{it(x - y)}\, d\tilde{\mu}(t)\, dy \]

/oo / f'OO \ POO 

-oo \J—oo J J —oo 

where $\delta(x)$ is the Dirac delta function and we have used Fubini's theorem in the third step. This
proves the claim. □